Multi-task Learning of Order-Consistent Causal Graphs

Neural Information Processing Systems

We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose an $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order (or topological order) than separate estimations. Moreover, the joint estimator is able to recover non-identifiable DAGs, by estimating them together with some identifiable DAGs. Lastly, our analysis also shows the consistency of union support recovery of the structures. To allow practical implementation, we design a continuous optimization problem whose optimizer is the same as the joint estimator and can be approximated efficiently by an iterative algorithm.
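The $l_1/l_2$-regularized joint estimation described above can be sketched, very roughly, as a group-lasso proximal-gradient loop over the $K$ weighted adjacency matrices. The sketch below uses a plain least-squares loss and omits the acyclicity and causal-order machinery the paper develops; all names are illustrative, not the paper's:

```python
import numpy as np

def joint_sem_estimate(Xs, lam=0.1, lr=0.01, iters=500):
    """Group-lasso (l1/l2) sketch for K linear SEMs X ~= X @ B_k.

    The penalty couples each edge (i, j) across all K tasks, which
    encourages a shared sparse union of supports. Illustrative only:
    the paper's estimator uses the Gaussian likelihood and recovers a
    causal order; this is plain regularized least squares.
    """
    K, d = len(Xs), Xs[0].shape[1]
    Bs = np.zeros((K, d, d))
    for _ in range(iters):
        # Gradient of 0.5/n * ||X - X B_k||_F^2 for each task k.
        grads = np.stack([X.T @ (X @ B - X) / len(X)
                          for X, B in zip(Xs, Bs)])
        Bs -= lr * grads
        # Group soft-thresholding: shrink edge (i, j) jointly over tasks,
        # so an edge is kept or discarded for all K graphs at once.
        norms = np.linalg.norm(Bs, axis=0)
        Bs *= np.maximum(1.0 - lr * lam / np.maximum(norms, 1e-12), 0.0)
        for B in Bs:
            np.fill_diagonal(B, 0.0)  # no self-loops
    return Bs
```

With a large penalty every edge group is zeroed jointly, which is exactly the cross-task coupling the $l_1/l_2$ norm buys over $K$ separate lasso problems.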



AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Yun, Kwan, Hong, Seokhyeon, Kim, Chaelin, Noh, Junyong

arXiv.org Artificial Intelligence

Despite recent advancements in learning-based motion in-betweening, a key limitation has been overlooked: the requirement for character-specific datasets. In this work, we introduce AnyMoLe, a novel method that addresses this limitation by leveraging video diffusion models to generate motion in-between frames for arbitrary characters without external data. Our approach employs a two-stage frame generation process to enhance contextual understanding. Furthermore, to bridge the domain gap between real-world and rendered character animations, we introduce ICAdapt, a fine-tuning technique for video diffusion models. Additionally, we propose a "motion-video mimicking" optimization technique, enabling seamless motion generation for characters with arbitrary joint structures using 2D and 3D-aware features. AnyMoLe significantly reduces data dependency while generating smooth and realistic transitions, making it applicable to a wide range of motion in-betweening tasks.


cc1aa436277138f61cda703991069eaf-Paper.pdf

Neural Information Processing Systems

We study the problem of estimating continuous quantities, such as prices, probabilities, and point spreads, using a crowdsourcing approach. A challenging aspect of combining the crowd's answers is that workers' reliabilities and biases are usually unknown and highly diverse. Control items with known answers can be used to evaluate workers' performance, and hence improve the combined results on the target items with unknown answers. This raises the problem of how many control items to use when the total number of items each worker can answer is limited: more control items evaluate the workers better, but leave fewer resources for the target items that are of direct interest, and vice versa. We give theoretical results for this problem under different scenarios, and provide a simple rule of thumb for crowdsourcing practitioners. As a byproduct, we also provide a theoretical analysis of the accuracy of different consensus methods.
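The control-versus-target trade-off above can be made concrete with a simple precision-weighted consensus: calibrate each worker's bias and noise variance on the control items, then combine debiased target answers with inverse-variance weights. This is one plausible consensus method, not necessarily the one the paper analyzes; all names are illustrative:

```python
import numpy as np

def combine_crowd(control_truth, control_answers, target_answers):
    """Precision-weighted consensus sketch for continuous quantities.

    control_answers, target_answers: (num_workers, num_items) arrays.
    Each worker's bias and noise variance are estimated from the control
    items (known answers), then target answers are debiased and averaged
    with inverse-variance weights. Illustrative, not the paper's method.
    """
    errors = control_answers - control_truth[None, :]
    bias = errors.mean(axis=1)          # per-worker additive bias
    var = errors.var(axis=1) + 1e-6     # per-worker noise (regularized)
    w = 1.0 / var
    w /= w.sum()                        # normalize precision weights
    debiased = target_answers - bias[:, None]
    return w @ debiased                 # consensus estimate per target item
```

The tension the abstract describes shows up directly here: the bias and variance estimates improve with more control items, but every control item spends budget that could have gone to a target item.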


Joint Learning of Linear Time-Invariant Dynamical Systems

Modi, Aditya, Faradonbeh, Mohamad Kazem Shirani, Tewari, Ambuj, Michailidis, George

arXiv.org Machine Learning

The problem of identifying the transition matrices in linear time-invariant dynamical systems (LTIDS) has been studied extensively in the literature (Lai and Wei, 1983; Kailath et al., 2000; Buchmann and Chan, 2007). Recent works establish finite-time rates for accurately learning the dynamics in different online and offline settings (Faradonbeh et al., 2018; Simchowitz et al., 2018; Sarkar and Rakhlin, 2019). The existing results are established assuming that the goal is to identify the transition matrix of a single dynamical system. However, in many areas where LTIDS models are used, such as macroeconomics (Stock and Watson, 2016), functional genomics (Fujita et al., 2007), and neuroimaging (Seth et al., 2015), one observes multiple dynamical systems and needs to estimate the transition matrices for all of them jointly. Further, the underlying dynamical systems share commonalities, but also exhibit heterogeneity. For example, Skripnikov and Michailidis (2019a) analyze economic indicators of US states whose local economies share a strong manufacturing base. Moreover, in time-course genetics experiments, one is interested in understanding the dynamics and drivers of gene expressions across related animal or cell line populations (Basu et al., 2015), while in neuroimaging, one has access to data from multiple subjects that suffer from the same disease (Skripnikov and Michailidis, 2019b). In all these settings, there are remarkable similarities in the dynamics of the systems, but some degree of heterogeneity is also present. Hence, it becomes natural to pursue a joint learning strategy for the systems' transition matrices.
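One naive way to realize a joint learning strategy of this kind is to shrink each system's least-squares estimate toward the pooled average, trading a little per-system bias for lower variance when the systems share structure. The sketch below is an assumption-laden stand-in, not the paper's estimator:

```python
import numpy as np

def joint_lti_estimate(trajectories, alpha=0.5):
    """Sketch of joint transition-matrix estimation for K LTI systems
    x_{t+1} = A_k x_t + noise.

    Each system's OLS estimate is shrunk toward the pooled average,
    exploiting cross-system similarity. Illustrative only: the paper
    develops a formal joint estimator with finite-time guarantees.
    """
    ols = []
    for X in trajectories:              # X: (T, d) state trajectory
        X0, X1 = X[:-1], X[1:]
        # Row-wise, x_{t+1}^T = x_t^T A^T, so solving X0 @ M = X1
        # gives M = A^T and hence A_hat = M^T.
        M = np.linalg.lstsq(X0, X1, rcond=None)[0]
        ols.append(M.T)
    pooled = np.mean(ols, axis=0)       # shared component across systems
    return [alpha * A + (1 - alpha) * pooled for A in ols]
```

With `alpha=1` this degenerates to fully separate OLS, and with `alpha=0` to complete pooling; intermediate values interpolate between the homogeneity and heterogeneity extremes the excerpt describes.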


Multi-task Learning of Order-Consistent Causal Graphs

Chen, Xinshi, Sun, Haoran, Ellington, Caleb, Xing, Eric, Song, Le

arXiv.org Machine Learning

We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose an $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order (or topological order) than separate estimations. Moreover, the joint estimator is able to recover non-identifiable DAGs, by estimating them together with some identifiable DAGs. Lastly, our analysis also shows the consistency of union support recovery of the structures. To allow practical implementation, we design a continuous optimization problem whose optimizer is the same as the joint estimator and can be approximated efficiently by an iterative algorithm. We validate the theoretical analysis and the effectiveness of the joint estimator in experiments.


Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

Xu, Kan, Zhao, Xuanyi, Bastani, Hamsa, Bastani, Osbert

arXiv.org Machine Learning

Sparse regression has recently been applied to enable transfer learning from very limited data. We study an extension of this approach to unsupervised learning -- in particular, learning word embeddings from unstructured text corpora using low-rank matrix factorization. Intuitively, when transferring word embeddings to a new domain, we expect that the embeddings change for only a small number of words -- e.g., the ones with novel meanings in that domain. We propose a novel group-sparse penalty that exploits this sparsity to perform transfer learning when there is very little text data available in the target domain -- e.g., a single article of text. We prove generalization bounds for our algorithm. Furthermore, we empirically evaluate its effectiveness, both in terms of prediction accuracy on downstream tasks and the interpretability of the results.
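The group-sparse penalty this abstract describes can be sketched as a row-sparse correction to the source embeddings, fitted by proximal gradient descent so that whole words (rows) are either moved or left untouched as a group. Names and the simple loop are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def transfer_embeddings(M_tgt, U, V_src, lam=0.1, lr=0.05, iters=300):
    """Group-sparse transfer sketch: fit M_tgt ~= U @ (V_src + Delta).T
    with a row-sparse correction Delta, so only a few words' embeddings
    move away from the source domain.

    M_tgt: (n_ctx, n_words) target co-occurrence matrix; U: (n_ctx, r)
    context factors; V_src: (n_words, r) source embeddings. Illustrative
    stand-in for the paper's group-sparse matrix factorization.
    """
    Delta = np.zeros_like(V_src)
    for _ in range(iters):
        resid = U @ (V_src + Delta).T - M_tgt
        grad = resid.T @ U / M_tgt.shape[0]
        Delta -= lr * grad
        # Row-wise group soft-thresholding: a word's whole embedding
        # correction is zeroed or kept as a unit.
        norms = np.linalg.norm(Delta, axis=1, keepdims=True)
        Delta *= np.maximum(1.0 - lr * lam / np.maximum(norms, 1e-12), 0.0)
    return V_src + Delta, Delta
```

The group structure is what makes the result interpretable: the nonzero rows of `Delta` name exactly the words whose meaning the target domain shifted.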